{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!floyd run --data samlynnevans/datasets/translationdata/1:trans_data --gpu"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Running The Transformer on FloydHub\n",
    "\n",
    "Here you have all the code and computing power required to perform seq2seq neural machine translation using the transformer model.\n",
    "\n",
    "## Using the sample data provided\n",
    "\n",
    "Having an initial play with the model couldn't be simpler; click on the first line of this notebook and hit shift+enter.\n",
    "\n",
    "The model will then begin to train on the sample data of 150,000 French-English sentence pairs and run for 10 epochs. This should be long enough to build a simple translator.\n",
    "\n",
    "As the model trains, weights will be saved automatically every 15 minutes and at the end of each epoch. You need to wait for the loss to reach under 1.5 to start seeing some results.\n",
    "\n",
    "## Testing the model\n",
    "\n",
    "When you have finished training, you can download these weights into this Workspace by opening a terminal and writing this command:\n",
    "\n",
    "`floyd data clone JOBNAME`\n",
    "    \n",
    "The name of your job will be written above (if you've run the first command) after where it says 'floyd logs'. It will be formatted like this `USERNAME/projects/PROJECTNAME/JOBNUMBER`\n",
    "\n",
    "You can then test the model by opening the terminal and running this command to setup the environment:</br>\n",
    "\n",
    "`spacy download en && spacy download fr`\n",
    "\n",
    "Then this command to test the model:</br>\n",
    "\n",
    "`python translate.py -load_weights weights -src_lang en -trg_lang fr -floyd -no_cuda`\n",
    "\n",
    "\n",
    "## Training further\n",
    "\n",
    "if you'd like to train the model for more epochs with the saved weights, you will need to mount them by editing the floyd run command like so:\n",
    "\n",
    "`!floyd run --data samlynnevans/datasets/translationdata/1:trans_data --data USERNAME/projects/PROJECTNAME/JOBNUMBER/weights:weights --gpu`\n",
    "\n",
    "You will then need to open the floyd.yml file and add ` -load_weights /floyd/input/weights/weights` to the command line at the end.\n",
    "\n",
    "\n",
    "## Using your own data\n",
    "\n",
    "To run this model with your own data, you will firstly need to create two text files; one for each language, where each line in one file contains the translation of the corresponding line in the other file.\n",
    "\n",
    "Upload your own data to floydhub following the steps shown here https://docs.floydhub.com/guides/create_and_upload_dataset/\n",
    "\n",
    "Then edit the first line of this jupyter notebook to:\n",
    "\n",
    "`!floyd run --data USERNAME/datasets/DIRNAME/1:trans_data --gpu`\n",
    "\n",
    "You now need to let the model know which langauges you are translating. The supported langauges and their codes are {English : 'en', French : 'fr', Portugese : 'pt', Italian : 'it', Dutch : 'nl', Spanish : 'es', German : 'de'}.\n",
    "\n",
    "Open the floyd.yml file and change the command line so that you have `spacy download SRC_LANGUAGE_CODE && spacydownload TRG_LANGUAGE_CODE`. In the same line, you will also need to change the flags -src_lang en -trg_lang fr to the appropriate spacy codes, and change `-src_data /floyd/input/trans_data/english.txt -trg_data /floyd/input/trans_data/english.txt` to the file names you uploaed.\n",
    "\n",
    "Then click the first line of this notebook, and hit shift + enter to begin training!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
